Stroke-model-based character extraction from gray-level document images
Authors
Abstract
Global gray-level thresholding techniques such as Otsu's method, and local gray-level thresholding techniques such as edge-based segmentation or the adaptive thresholding method are powerful in extracting character objects from simple or slowly varying backgrounds. However, they are found to be insufficient when the backgrounds include sharply varying contours or fonts in different sizes. A stroke-model is proposed to depict the local features of character objects as double-edges in a predefined size. This model enables us to detect thin connected components selectively, while ignoring relatively large backgrounds that appear complex. Meanwhile, since the stroke width restriction is fully factored in, the proposed technique can be used to extract characters in predefined font sizes. To process large volumes of documents efficiently, a hybrid method is proposed for character extraction from various backgrounds. Using the measurement of class separability to differentiate images with simple backgrounds from those with complex backgrounds, the hybrid method can process documents with different backgrounds by applying the appropriate methods. Experiments on extracting handwriting from a check image, as well as machine-printed characters from scene images demonstrate the effectiveness of the proposed model.
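The hybrid idea in the abstract can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under stated assumptions, not the operator or decision rule defined in the paper. The separability measure is taken to be Otsu's ratio of between-class variance to total variance. The double-edge response is one plausible reading of "double-edges in a predefined size", computed along rows only and for dark strokes on a lighter background. The function names and both cutoff values are hypothetical.

```python
import numpy as np


def otsu_threshold_and_separability(image):
    """Return (threshold, eta) for an 8-bit gray-level image.

    eta is the maximum between-class variance divided by the total
    variance, so it lies in [0, 1]; values near 1 suggest a cleanly
    bimodal histogram (a simple background).
    """
    hist = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    omega = np.cumsum(prob)            # probability of class 0 for each candidate t
    mu = np.cumsum(prob * levels)      # cumulative first moment up to t
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    sigma_b = np.zeros(256)
    valid = denom > 0                  # skip thresholds that leave one class empty
    sigma_b[valid] = (mu_total * omega[valid] - mu[valid]) ** 2 / denom[valid]
    t = int(np.argmax(sigma_b))
    sigma_total = np.sum(prob * (levels - mu_total) ** 2)
    eta = sigma_b[t] / sigma_total if sigma_total > 0 else 0.0
    return t, eta


def double_edge_rows(image, width):
    """Row-wise double-edge response for dark strokes narrower than `width`.

    For each pixel x the response is
        max over i in 1..width-1 of min(f[x - i], f[x + width - i]) - f[x],
    clipped at zero. It is large only when lighter pixels lie on both
    sides of x within a window of `width` pixels. Requires width >= 2.
    """
    f = image.astype(np.int32)
    w = f.shape[1]
    padded = np.pad(f, ((0, 0), (width, width)), mode="edge")
    response = np.zeros_like(f)
    for i in range(1, width):
        left = padded[:, width - i: width - i + w]             # f[x - i]
        right = padded[:, 2 * width - i: 2 * width - i + w]    # f[x + width - i]
        response = np.maximum(response, np.minimum(left, right) - f)
    return response


def extract_characters(image, width, eta_cutoff=0.9, response_cutoff=40):
    """Hybrid dispatch; both cutoffs are hypothetical, not from the paper."""
    t, eta = otsu_threshold_and_separability(image)
    if eta >= eta_cutoff:
        return image <= t              # simple background: global threshold
    return double_edge_rows(image, width) >= response_cutoff
```

Calling `extract_characters(image, width=4)` on a `uint8` array returns a boolean mask of candidate character pixels. A fuller version would also scan columns and diagonals, since a purely row-wise response misses horizontal strokes.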
Related resources
Character Extraction from Gray Images Based on Mathematical Morphology
A morphology-based thresholding technique for character extraction from outdoor images is developed. Such extraction is traditionally difficult because of the influence of illumination changes, such as lightness, shadow, and reflection. Changes in character font, apparent size, and inclination also make extraction more difficult. Though there are some character reading systems for format and si...
A Model of Stroke Extraction from Chinese Character Images
Given the large number and complexity of Chinese characters, pattern matching based on structural decomposition and analysis is believed to be necessary and essential to off-line character recognition. This paper proposes a new model of stroke extraction for Chinese characters. One problem for stroke extraction is how to extract primary strokes. Another major problem is to solve the segmentatio...
Automatic Logo Extraction from Document Images
Logo extraction plays an important role in logo-based document image retrieval. Here we present a method for automatic logo extraction from scanned document images that contain a logo. The proposed method uses morphological operations for logo extraction and supports extracting a logo together with its gray-level and color information.
Character Energy and Link Energy-Based Text Extraction in Scene Images
Extracting text objects from scene images is a challenging problem. In this paper, by investigating the properties of single characters and text objects, we propose a new text extraction approach for scene images. First, character energy is computed based on the similarity of stroke edges to detect candidate character regions, then link energy is calculated based on the spatial relationship and...
Algorithm for Fast Detection and Identification of Characters in Gray-level Images
This paper discusses methods for character extraction based on statistical and structural features of gray levels, and proposes a dynamic local contrast accommodating line width. Precision locating of character groups is realized by exploiting horizontal projection and character arrangements of binary images along horizontal and vertical directions respectively. Also discussed is the method for...
Journal title:
- IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society
Volume 10, Issue 8
Pages -
Publication date: 2001